Statistical models for human body pose estimation from videos

Author

  • Tobias Jaeggli
Abstract

To investigate the task of multidimensional continuous inference from video sequences on a concrete example application, we focus on the problem of articulated 3D human tracking from monocular video. This is an interesting topic because of its relevance for biological vision systems, as well as its many applications in various domains. Estimating body pose and motion of humans is a challenging task, with difficulties such as self-occlusions and ambiguities. To account for unresolvable uncertainties of the visual analysis of such footage, we formulate the task as a probabilistic inference problem. The pose estimation and tracking algorithms are based on statistical models that can be automatically learned from a set of example data. Thanks to this architecture, the proposed approaches remain general and can be tailored to a specific task by the choice of training data sets. Prior knowledge can be provided in a flexible and theoretically well-motivated way. First, we propose an approach that is based on a model of the joint probability distribution of body pose and the corresponding human shape, as it can be observed in video images. Both body pose and shape are treated as multivariate random variables, by choosing suitable representations. The statistical model uses a mixture of Gaussian distributions to approximate the density, which enables efficient discriminative inference of body poses from shape descriptors. When additionally taking the unknown image locations of the persons into account, the posterior distributions become non-parametric. Therefore, a hybrid inference scheme based on a Rao-Blackwellised particle filter combines parametric inference with sample-based inference.

A second approach is based on a generative predictive model of human shape, using nonlinear regression. To enable efficient learning and sample-based inference, a low-dimensional embedding of human locomotion is determined, with a nonlinear dynamical model. This method is implemented using Locally Linear Embedding, and Relevance Vector Machines for sparse nonlinear regression. We also propose an integrated formulation of the model, fully based on Gaussian Process regression. The resulting tracking algorithms are tested on realistic video sequences with low resolution and image noise. We present extensions of the framework, for simultaneously tracking multiple persons that occlude each other, and for recognising the performed activity along with the pose estimation.
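
As an illustration of the first approach mentioned in the abstract, the sketch below shows how a Gaussian mixture over a joint vector (shape descriptor, pose) can be conditioned on an observed shape descriptor to obtain a mixture over poses. It is a minimal example under stated assumptions, not the author's implementation: the function names `gaussian_logpdf` and `condition_gmm`, the argument layout (shape dimensions first, pose dimensions last), and the toy numbers are all hypothetical.

```python
import numpy as np


def gaussian_logpdf(x, mean, cov):
    """Log density of a multivariate Gaussian, evaluated via a Cholesky factor."""
    d = x.shape[0]
    chol = np.linalg.cholesky(cov)
    z = np.linalg.solve(chol, x - mean)
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    return -0.5 * (d * np.log(2.0 * np.pi) + log_det + z @ z)


def condition_gmm(weights, means, covs, x, dx):
    """Condition a joint mixture over [x; y] on an observed x.

    weights: (K,), means: (K, dx + dy), covs: (K, dx + dy, dx + dy).
    The first dx dimensions are the observed part (here: the shape
    descriptor), the rest the unknown part (here: the pose).
    Returns the weights, means and covariances of the mixture p(y | x).
    """
    log_w, cond_means, cond_covs = [], [], []
    for pi_k, mu, cov in zip(weights, means, covs):
        mu_x, mu_y = mu[:dx], mu[dx:]
        s_xx, s_xy, s_yy = cov[:dx, :dx], cov[:dx, dx:], cov[dx:, dx:]
        # gain = S_yx S_xx^{-1}, computed with a solve instead of an inverse
        gain = np.linalg.solve(s_xx, s_xy).T
        cond_means.append(mu_y + gain @ (x - mu_x))
        cond_covs.append(s_yy - gain @ s_xy)
        # responsibility of component k for the observation x
        log_w.append(np.log(pi_k) + gaussian_logpdf(x, mu_x, s_xx))
    log_w = np.array(log_w)
    w = np.exp(log_w - log_w.max())  # subtract the max for numerical stability
    return w / w.sum(), np.array(cond_means), np.array(cond_covs)


if __name__ == "__main__":
    # Toy joint model: 1-D "shape" and 1-D "pose", two components.
    weights = np.array([0.5, 0.5])
    means = np.array([[-2.0, -1.0], [2.0, 1.0]])
    covs = np.array([[[1.0, 0.5], [0.5, 1.0]], [[1.0, -0.3], [-0.3, 1.0]]])
    w, m, c = condition_gmm(weights, means, covs, np.array([1.5]), dx=1)
    print("posterior weights:", w)
    print("posterior pose mean:", w @ m)
```

Because every component of the result is again Gaussian, the pose posterior stays parametric, which is the property that makes this kind of inference cheap compared with purely sample-based schemes.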

Similar resources

Robust Statistical Approach for Extraction of Moving Human Silhouettes from Videos

Human pose estimation is one of the key problems in computer vision that has been studied in recent years. Its significance lies in higher-level tasks of understanding human actions, such as recognition of anomalous actions in videos and many other related applications. The human poses can be estimated by extracting silhouettes of humans as silh...

A Framework for Human Pose Estimation in Videos

In this paper, we present a method to estimate a sequence of human poses in unconstrained videos. We aim to demonstrate that by using temporal information, the human pose estimation results can be improved over image based pose estimation methods. In contrast to the commonly employed graph optimization formulation, which is NP-hard and needs approximate solutions, we formulate this problem into...

Pose Estimation and Tracking of Eating Persons in Real-life Settings

We present an approach to estimate and track 2D upper body poses of persons who are having a meal in videos with highly challenging uncontrolled imaging conditions. We employ a probabilistic model that represents the body as a kinematic tree, perform inference in this kinematic tree model using particle filtering, and also estimate self-occlusions. Our approach is evaluated with 7 different v...

Multi-camera 3D human pose estimation by fitting the projection of an articulated 3D skeleton model to silhouette images

Automatic capture and analysis of human motion based on images or video is an important issue in computer vision due to the vast number of applications in animation, surveillance, biomechanics, human-computer interaction, entertainment, and the game industry. In these applications, it is clear that 3D human pose estimation is an essential part. Therefore, its accuracy has a great effect on the perform...

Pose Estimation of Players in Hockey Videos using Convolutional Neural Networks

Traditional hockey scouting procedures for evaluating player performance are based on visual monitoring of hockey videos and statistics. However, that evaluation is time-consuming and prone to human bias. In addition, current research within hockey analytics quantifies player performances by employing statistical models on common hockey statistics. To improve statistical models and increase the ...

Publication date: 2008